Unsupervised Learning of P NP P Word Combinations
Authors
Abstract
We evaluate the possibility of learning, in an unsupervised manner, a list of idiomatic word combinations of the type preposition + noun phrase + preposition (P NP P), that is, groups of three or more simple forms that behave as a single lexical unit and have semantic and syntactic properties not deducible from the corresponding properties of each simple form, e.g., by means of, in order to, in front of. We show that idiomatic P NP P combinations have some statistical properties distinct from those of usual idiomatic collocations. In particular, we found that the most frequent P NP P trigrams tend to be idiomatic. Among the other statistical measures, log-likelihood performs almost as well as frequency for detecting idiomatic expressions of this type, while chi-square and pointwise mutual information perform very poorly. We experiment on Spanish material.
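As a rough illustration of two of the measures named in the abstract, the sketch below scores every contiguous trigram in a token list by raw frequency and by pointwise mutual information. It is not the authors' implementation: the function name `trigram_scores`, the toy corpus, and the choice to compute PMI against a full-independence model of the three words are all assumptions made for this example, and the sketch does not filter trigrams by the P NP P part-of-speech pattern.

```python
import math
from collections import Counter


def trigram_scores(tokens):
    """Return {trigram: (frequency, pmi)} for all contiguous trigrams.

    PMI is computed against a full-independence model,
    p(w1) * p(w2) * p(w3); this generalization to trigrams is an
    assumption of the sketch, not something the abstract specifies.
    """
    unigrams = Counter(tokens)
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    n_tokens = len(tokens)
    n_trigrams = sum(trigrams.values())

    scores = {}
    for (w1, w2, w3), freq in trigrams.items():
        p_observed = freq / n_trigrams
        p_independent = (
            (unigrams[w1] / n_tokens)
            * (unigrams[w2] / n_tokens)
            * (unigrams[w3] / n_tokens)
        )
        pmi = math.log2(p_observed / p_independent)
        scores[(w1, w2, w3)] = (freq, pmi)
    return scores


# Hypothetical toy corpus, far too small for meaningful statistics.
tokens = "por medio de la ley y por medio de la fuerza a fin de ganar".split()
scores = trigram_scores(tokens)

# Two alternative rankings of the same candidates.
by_frequency = sorted(scores, key=lambda t: scores[t][0], reverse=True)
by_pmi = sorted(scores, key=lambda t: scores[t][1], reverse=True)
```

On this toy input the rarer trigram `a fin de` receives a higher PMI than the more frequent `por medio de`, which shows how the two rankings can diverge. A real experiment would need a large corpus and a part-of-speech filter for the P NP P pattern.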
Similar resources
Towards the Automatic Learning of Idiomatic Prepositional Phrases
The objective of this work is to automatically determine, in an unsupervised manner, Spanish prepositional phrases of the type preposition–nominal phrase–preposition (P−NP−P) that behave in a sentence as a lexical unit and whose semantic and syntactic properties cannot be deduced from the corresponding properties of each simple form, e.g., por medio de (by means of), a fin de (in order to), con...
A Flexible Unsupervised PP-Attachment Method Using Semantic Information
In this paper we revisit the classical NLP problem of prepositional phrase attachment (PP-attachment). Given the pattern V−NP1−P−NP2 in the text, where V is a verb, NP1 is a noun phrase, P is a preposition and NP2 is the other noun phrase, the question asked is where P−NP2 attaches: to V or to NP1? This question is typically answered using both word and world knowledge. Word Sense Disambig...
Web-Based Model for Disambiguation of Prepositional Phrase Usage
We explore some Web-based methods to differentiate strings of words corresponding to Spanish prepositional phrases that can function either as a regular prepositional phrase or as an idiomatic prepositional phrase. The type of these Spanish prepositional phrases is preposition–nominal phrase–preposition (P−NP−P), for example: por medio de ‘by means of’, a fin de ‘in order to’, con respecto a ‘with ...
Spectral Unsupervised Parsing with Additive Tree Metrics
We propose a spectral approach for unsupervised constituent parsing that comes with theoretical guarantees on latent structure recovery. Our approach is grammarless – we directly learn the bracketing structure of a given sentence without using a grammar model. The main algorithm is based on lifting the concept of additive tree metrics for structure learning of latent trees in the phylogenetic a...
Integrating Semantic Frames from Multiple Sources
Making senses: bootstrapping sense-tagged lists of semantically-related words; Enriching wordnets with new relations and with event and argument structures; Experiments in cross-language morphological annotation transfer; Sentence segmentation model to improve tree annotation tool; Markov cluster shortest path founded upon the alibi-breaking algorithm; Unsupervised lea...